Efficient Bayesian Clustering for Reinforcement Learning
Abstract
A fundamental challenge in artificial intelligence is how to design agents that intelligently trade off exploration and exploitation while quickly learning about an unknown environment. To learn quickly, however, an agent must somehow generalize experience across states. One promising approach is to use Bayesian methods to simultaneously cluster dynamics and control exploration; unfortunately, these methods tend to require computationally intensive MCMC approximation techniques that lack guarantees. We propose Thompson Clustering for Reinforcement Learning (TCRL), a family of Bayesian clustering algorithms for reinforcement learning that leverage structure in the state space to remain computationally efficient while controlling both exploration and generalization. TCRL-Theoretic achieves near-optimal Bayesian regret bounds while consistently improving over a standard Bayesian exploration approach. TCRL-Relaxed is guaranteed to converge to acting optimally, and empirically outperforms state-of-the-art Bayesian clustering algorithms across a variety of simulated domains, even in cases where no states are similar.
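The idea of generalizing experience across states by clustering their dynamics can be illustrated with a small sketch. This is not the TCRL algorithm, whose details the abstract does not give. It is a generic Thompson-style posterior draw under stated assumptions: a single fixed action, a given (not learned) clustering, a symmetric Dirichlet prior, and outcome counts that are comparable across states in a cluster. All function and variable names are hypothetical.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one sample from Dirichlet(alphas) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def pooled_counts(counts, cluster):
    """Sum outcome counts over all states assigned to one cluster."""
    n_outcomes = len(counts[cluster[0]])
    pooled = [0] * n_outcomes
    for state in cluster:
        for j, c in enumerate(counts[state]):
            pooled[j] += c
    return pooled


def thompson_sample_dynamics(counts, clusters, prior=1.0, seed=0):
    """Sample one outcome distribution per cluster and share it among members.

    Assumption: the clustering is given; learning it is out of scope here.
    """
    rng = random.Random(seed)
    sampled = {}
    for cluster in clusters:
        # Posterior parameters: prior pseudo-count plus pooled observations.
        alphas = [prior + c for c in pooled_counts(counts, cluster)]
        probs = sample_dirichlet(alphas, rng)
        for state in cluster:
            sampled[state] = probs
    return sampled


# Hypothetical outcome counts for three states under one action.
counts = {"s0": [3, 1], "s1": [5, 0], "s2": [0, 4]}
clusters = [["s0", "s1"], ["s2"]]
sampled = thompson_sample_dynamics(counts, clusters)
```

States in the same cluster share one sampled distribution, so an observation in `s0` also sharpens the posterior used for `s1`. An agent would then plan against the sampled model, which is the usual Thompson sampling pattern.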
Similar resources
Efficient Bayesian Nonparametric Methods for Model-Free Reinforcement Learning in Centralized and Decentralized Sequential Environments
By Miao Liu, Department of Electrical and Computer Engineering, Duke University.
Efficient Structure Learning in Factored-State MDPs
We consider the problem of reinforcement learning in factored-state MDPs in the setting in which learning is conducted in one long trial with no resets allowed. We show how to extend existing efficient algorithms that learn the conditional probability tables of dynamic Bayesian networks (DBNs) given their structure to the case in which DBN structure is not known in advance. Our method learns th...
Multiagent Planning with Bayesian Nonparametric Asymptotics
Autonomous multiagent systems are beginning to see use in complex, changing environments that cannot be completely specified a priori. In order to be adaptive to these environments and avoid the fragility associated with making too many a priori assumptions, autonomous systems must incorporate some form of learning. However, learning techniques themselves often require structural assumptions to...
Transfer Learning for Reinforcement Learning with Dependent Dirichlet Process and Gaussian Process
The ability to transfer knowledge across tasks is important in guaranteeing the performance of lifelong learning in autonomous agents. We propose a flexible Bayesian Nonparametric (BNP) model-based architecture for transferring knowledge between reinforcement learning domains. A Dependent Dirichlet Process Gaussian Process hierarchical BNP model is used to cluster different classes of source MDP...
Probabilistic Reasoning through Genetic Algorithms and Reinforcement Learning
In this paper, we develop an efficient approach for inference over Bayesian networks by using a reinforcement learning controller to direct a genetic algorithm. The random variables of a Bayesian network can be grouped into several sets reflecting the strong probabilistic correlations between random variables in the group. We build a reinforcement learning controller to identify these groups a...